Next-Generation Morphometry for pathomics-data mining in histopathology

Pathology diagnostics relies on the assessment of morphology by trained experts, which remains subjective and qualitative. Here we developed a framework for large-scale histomorphometry (FLASH) performing deep learning-based semantic segmentation and subsequent large-scale extraction of interpretable, quantitative, morphometric features in non-tumour kidney histology. We use two internal and three external, multi-centre cohorts to analyse over 1000 kidney biopsies and nephrectomies. By associating morphometric features with clinical parameters, we confirm previous concepts and reveal unexpected relations. We show that the extracted features are independent predictors of long-term clinical outcomes in IgA-nephropathy. We introduce single-structure morphometric analysis by applying techniques from single-cell transcriptomics, identifying distinct glomerular populations and morphometric phenotypes along a trajectory of disease progression. Our study provides a concept for Next-generation Morphometry (NGM), enabling comprehensive quantitative pathology data mining, i.e., pathomics.


Data availability statement
The pathomics data, associated clinical data and many segmentation images (>2000 paired image patches, each a PAS image plus its segmentation) generated in this study have been deposited in our Git repository: https://git-ce.rwth-aachen.de/labooratory-ai/flash. The raw whole slide image data are available under restricted access for privacy protection reasons; access can be obtained by directly contacting Peter Boor, Institute of Pathology, RWTH Aachen University Clinic, Aachen, Germany, pboor@ukaachen.de (for the AC_B and AC_N datasets) or Rosanna Coppo, Fondazione Ricerca Molinette, Torino, Italy, rosanna.coppo@unito.it (for the VALIGA dataset). In general, requests will be evaluated within 4 weeks based on institutional and trial policies. Data can only be shared for non-commercial research purposes, and sharing requires a data transfer agreement. The aggregated data and raw data used to create the figure panels in this study are provided in the Supplementary Information/Source Data files. The public external image and clinical data used in this study are available in the KPMP (atlas.kpmp.org/repository) and HubMAP (portal.hubmapconsortium.org) databases.

Code availability statement
The source code for FLASH and instructions on how to use it are freely available at: git-ce.rwth-aachen.de/labooratory-ai/flash.
All clinical data in our study contain information only on the sex of the patients, as recorded in the pathology information system, and we do not refer to the patients' gender in the manuscript. We have provided clinical data on the distribution of sex in the different cohorts and experiments in the supplementary information.
Two internal, single-centre cohorts (Aachen Biopsy & Aachen Nephrectomy, AC_B & AC_N) and three external, multi-centre cohorts (HubMAP, KPMP, VALIGA) of kidney biopsies and nephrectomies were included. The two largest cohorts in this study are AC_B and VALIGA, which together account for approximately 92% of all cases used. Demographic and clinical characteristics were comparable between cohorts, with three exceptions: patients in the VALIGA cohort were younger and more often male; reduced kidney function, assessed by estimated glomerular filtration rate (eGFR), was more common in the AC_B cohort; and hypertension was more prevalent in the AC_N cohort. From the initial VALIGA trial cohort, 768 cases could be identified and digitised (scanned). Overall, 106 cases were excluded. An additional 14 cases were excluded at slide level due to artefacts, leaving a total of 648 PAS-stained WSIs from 648 cases. Patients from the five cohorts were recruited retrospectively, and the cohorts were refined based on defined exclusion criteria (Life sciences study design: Data exclusions). Exclusion criteria served solely to ensure high quality of the kidney specimens and histological slides.
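The VALIGA case flow reported above can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the variable and function names are hypothetical and not part of FLASH, and it assumes that the 106 and 14 exclusions are disjoint and applied sequentially to the 768 digitised cases.

```python
# Case counts as stated in the text for the VALIGA cohort.
digitised_cases = 768        # cases identified and scanned
excluded_case_level = 106    # cases excluded overall
excluded_slide_level = 14    # additional cases excluded at slide level (artefacts)


def remaining_cases(start: int, *exclusions: int) -> int:
    """Subtract each exclusion step in turn from the starting count."""
    remaining = start
    for n in exclusions:
        if n < 0 or n > remaining:
            raise ValueError("exclusion count out of range")
        remaining -= n
    return remaining


included = remaining_cases(digitised_cases, excluded_case_level, excluded_slide_level)
print(included)  # 648, matching the number of included PAS-stained WSIs
```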
Data collection and analysis in this study were performed in accordance with the Declaration of Helsinki and were approved by the local ethics committee of the RWTH Aachen University (EK-No. 315/19). All analyses were performed retrospectively and anonymously, and the need for informed consent was waived by the local ethics and privacy committee for all datasets.